
    Lattice Identification and Separation: Theory and Algorithm

    Motivated by lattice mixture identification and grain boundary detection, we present a framework for lattice pattern representation and comparison, and propose an efficient algorithm for lattice separation. We define new scale and shape descriptors, which help to considerably reduce the size of equivalence classes of lattice bases. These finitely many equivalence relations are fully characterized by modular group theory. We construct the lattice space $\mathscr{L}$ based on the equivalent descriptors and define a metric $d_{\mathscr{L}}$ to accurately quantify the visual similarities and differences between lattices. Furthermore, we introduce the Lattice Identification and Separation Algorithm (LISA), which identifies each lattice pattern from superposed lattices. LISA finds lattice candidates from the high responses in the image spectrum, then sequentially extracts the layers of lattice patterns one by one. Analyzing the frequency components, we reveal the intricate dependency of LISA's performance on particle radius, lattice density, and relative translations. Various numerical experiments are designed to show LISA's robustness against a large number of lattice layers, moiré patterns, and missing particles. Comment: 30 pages plus 4 pages of appendix, 4 pages of references, 24 figures
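
    The basis ambiguity that the abstract's descriptors are meant to reduce can be illustrated with a standard fact: two bases generate the same 2D lattice exactly when they are related by an integer matrix of determinant ±1. The sketch below checks only that condition. It is not the paper's descriptors, metric, or LISA, and the function name `same_lattice` and the example bases are hypothetical.

```python
import numpy as np

def same_lattice(b1, b2, tol=1e-9):
    """Return True if the columns of b1 and b2 generate the same 2D lattice.

    Illustrative check only: the bases must be related by an integer
    matrix with determinant +1 or -1 (a unimodular change of basis).
    """
    b1 = np.asarray(b1, dtype=float)
    b2 = np.asarray(b2, dtype=float)
    # Solve b1 @ m = b2 for the change-of-basis matrix m.
    m = np.linalg.solve(b1, b2)
    m_rounded = np.rint(m)
    is_integer = np.allclose(m, m_rounded, atol=tol)
    is_unimodular = abs(abs(np.linalg.det(m_rounded)) - 1.0) < tol
    return bool(is_integer and is_unimodular)

# Hypothetical example bases (basis vectors stored as columns).
square = np.array([[1.0, 0.0], [0.0, 1.0]])
sheared = np.array([[1.0, 1.0], [0.0, 1.0]])  # columns (1,0) and (1,1)
doubled = np.array([[2.0, 0.0], [0.0, 1.0]])  # a strict sublattice

print(same_lattice(square, sheared))  # same point set, different basis
print(same_lattice(square, doubled))  # different lattices
```

    The sheared basis looks different from the square one but produces the same points, which is why a comparison between lattices needs descriptors that do not depend on the choice of basis.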

    Low Rank Approximation of Binary Matrices: Column Subset Selection and Generalizations

    Low rank matrix approximation is an important tool in machine learning. Given a data matrix, low rank approximation helps to find factors and patterns, and provides concise representations for the data. Research on low rank approximation usually focuses on real matrices. However, in many applications data are binary (categorical) rather than continuous. This leads to the problem of low rank approximation of binary matrices. Here we are given a $d \times n$ binary matrix $A$ and a small integer $k$. The goal is to find two binary matrices $U$ and $V$ of sizes $d \times k$ and $k \times n$ respectively, so that the Frobenius norm of $A - UV$ is minimized. There are two models of this problem, depending on the definition of the dot product of binary vectors: the $\mathrm{GF}(2)$ model and the Boolean semiring model. Unlike low rank approximation of real matrices, which can be efficiently solved by Singular Value Decomposition, approximation of binary matrices is NP-hard even for $k=1$. In this paper, we consider the problem of Column Subset Selection (CSS), in which one low rank matrix must be formed by $k$ columns of the data matrix. We characterize the approximation ratio of CSS for binary matrices. For the $\mathrm{GF}(2)$ model, we show the approximation ratio of CSS is bounded by $\frac{k}{2}+1+\frac{k}{2(2^k-1)}$ and this bound is asymptotically tight. For the Boolean model, it turns out that CSS is no longer sufficient to obtain a bound. We then develop a Generalized CSS (GCSS) procedure in which the columns of one low rank matrix are generated from Boolean formulas operating bitwise on columns of the data matrix. We show the approximation ratio of GCSS is bounded by $2^{k-1}+1$, and the exponential dependency on $k$ is inherent. Comment: 38 pages
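
    To make the CSS setting under the $\mathrm{GF}(2)$ model concrete, here is a minimal brute-force sketch, assuming a tiny matrix where exhaustive search is affordable. It is not the paper's procedure or analysis. The names `best_fit_gf2` and `css_gf2` and the example matrix are hypothetical. The error counted is the number of mismatched entries, which for binary matrices equals the squared Frobenius norm of the difference.

```python
from itertools import combinations, product

import numpy as np

def best_fit_gf2(U, a):
    """Smallest Hamming distance from column a to any GF(2) combination of U's columns."""
    k = U.shape[1]
    best = None
    for coeffs in product((0, 1), repeat=k):
        approx = (U @ np.array(coeffs)) % 2  # arithmetic modulo 2
        err = int(np.sum(approx != a))
        if best is None or err < best:
            best = err
    return best

def css_gf2(A, k):
    """Exhaustive column subset selection: try every k-subset of columns as U."""
    A = np.asarray(A, dtype=int)
    n = A.shape[1]
    best_err, best_cols = None, None
    for cols in combinations(range(n), k):
        U = A[:, list(cols)]
        # Each column of A is fitted independently by its best column of V.
        err = sum(best_fit_gf2(U, A[:, j]) for j in range(n))
        if best_err is None or err < best_err:
            best_err, best_cols = err, cols
    return best_cols, best_err

# Hypothetical example: the third column is the XOR of the first two.
A = np.array([[1, 0, 1],
              [0, 1, 1],
              [1, 1, 0]])

print(css_gf2(A, 2))  # two columns reproduce A exactly over GF(2)
print(css_gf2(A, 1))  # one column cannot, so some entries mismatch
```

    The search is exponential in both the number of columns and $k$, so it only serves to show what quantity the approximation ratio in the abstract compares against.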

    Different effects of the probe summarization algorithms PLIER and RMA on high-level analysis of Affymetrix exon arrays

    Background: Alternative splicing is an important mechanism that increases protein diversity and functionality in higher eukaryotes. Affymetrix exon arrays are a commercialized platform used to detect alternative splicing on a genome-wide scale. Two probe summarization algorithms, PLIER (Probe Logarithmic Intensity Error) and RMA (Robust Multichip Average), are commonly used to compute gene-level and exon-level expression values. However, a systematic comparison of these two algorithms on their effects on high-level analysis of the arrays has not yet been reported. Results: In this study, we showed that PLIER summarization led to over-estimation of gene-level expression changes, relative to exon-level expression changes, in two-group comparisons. Consequently, it led to detection of substantially more skipped exons on up-regulated genes, as well as substantially more included (i.e., non-skipped) exons on down-regulated genes. In contrast, this bias was not observed for RMA-summarized data. By using a published human tissue dataset, we compared the tissue-specific expression and splicing detected by Affymetrix exon arrays with those detected based on expressed sequence databases. We found the tendency of PLIER was not supported by the expressed sequence data. Conclusion: We showed that the tendency of PLIER in detection of alternative splicing is likely caused by a technical bias in the approach, rather than a biological bias. Moreover, we observed abnormal summarization results when using the PLIER algorithm, indicating that mathematical problems, such as numerical instability, may affect PLIER performance.

    Experiments on nonlinear extreme waves in complex configurations

    Nonlinear rogue or freak waves are rare but extreme events in the ocean, with heights at least twice those of the neighbouring waves. Understanding the properties of these events is crucial for ocean industry safety and for expanding knowledge of the nonlinear hydrodynamic wave process. In addition to the dispersive focusing, i.e. the superposition of linear or nonlinear waves, it has recently been shown that both nonlinear focusing mechanisms play an important role in the description of rogue wave formation. In fact, the modulation instability (MI) is known to be active for the dimensionless depth value kh > 1.363, whereas the nonlinear wave shoaling takes place below this threshold value, with the waves still being dispersive. The universal MI is also responsible for the formation of strong wave localizations in other nonlinear media, such as optical Kerr media, plasma, and Bose-Einstein condensates. These wave-focusing dynamics can be described by breather solutions of the nonlinear Schrödinger equation (NLSE), such as the time-periodic Akhmediev breather (AB), the space-periodic Kuznetsov-Ma breather (KB), and the space and time doubly-localized Peregrine breather (PB), which can seemingly "appear from nowhere and disappear without a trace". While the classical exact solutions of the NLSE have been extensively studied theoretically and experimentally, the integrability of the NLSE framework normally restricts them to the one-dimensional level and exact envelope shapes. Thus, the NLSE solutions differ from ordinary sea states, where wave shapes and directions are more arbitrary. In the current thesis, the role of wave nonlinearity in extreme wave formation beyond the analytical NLSE framework, i.e. under more complex wave configurations, was mainly investigated in two aspects: the effect of phase-shift and one particular case of wave field multi-directionality...(see the downloadable file for more
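
    The doubly-localized character of the Peregrine breather can be seen by evaluating its closed form numerically. The sketch below assumes the dimensionless focusing NLSE $i\psi_t + \tfrac{1}{2}\psi_{xx} + |\psi|^2\psi = 0$ with unit background amplitude. The abstract does not state a normalization, so this scaling, the function name `peregrine`, and the grid are assumptions made for illustration.

```python
import numpy as np

def peregrine(x, t):
    """Peregrine breather on a unit background.

    Assumed normalization: i psi_t + (1/2) psi_xx + |psi|^2 psi = 0.
    """
    x = np.asarray(x, dtype=float)
    t = np.asarray(t, dtype=float)
    denom = 1.0 + 4.0 * x**2 + 4.0 * t**2
    return (1.0 - 4.0 * (1.0 + 2.0j * t) / denom) * np.exp(1j * t)

# Hypothetical grid that contains the focal point (x, t) = (0, 0).
x = np.linspace(-20.0, 20.0, 401)
t = np.linspace(-20.0, 20.0, 401)
X, T = np.meshgrid(x, t)
amp = np.abs(peregrine(X, T))

peak = amp.max()        # envelope amplitude at the focal point
far_field = amp[0, 0]   # amplitude far from the focal point

print(f"peak amplitude: {peak:.4f}")
print(f"far-field amplitude: {far_field:.4f}")
```

    Under this normalization the envelope reaches three times the background at a single point in space and time and relaxes to the background elsewhere, which is the behaviour summarized by "appear from nowhere and disappear without a trace".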

    Mathematical and Data-driven Pattern Representation with Applications in Image Processing, Computer Graphics, and Infinite Dimensional Dynamical Data Mining

    Patterns represent the spatial or temporal regularities intrinsic to various phenomena in nature, society, art, and science. From rigid ones with well-defined generative rules to flexible ones implied by unstructured data, patterns can be assigned to a spectrum. On one extreme, patterns are completely described by algebraic systems where each individual pattern is obtained by repeatedly applying simple operations on primitive elements. On the other extreme, patterns are perceived as visual or frequency regularities without any prior knowledge of the underlying mechanisms. In this thesis, we aim at demonstrating some mathematical techniques for representing patterns traversing the aforementioned spectrum, which leads to qualitative analysis of the patterns' properties and quantitative prediction of the modeled behaviors from various perspectives. We investigate lattice patterns from material science, shape patterns from computer graphics, submanifold patterns encountered in point cloud processing, color perception patterns applied in underwater image processing, dynamic patterns from spatial-temporal data, and low-rank patterns exploited in medical image reconstruction. For different patterns and based on their dependence on structured or unstructured data, we present suitable mathematical representations using techniques ranging from group theory to deep neural networks. Ph.D.